Frame Level Audio Similarity - A Codebook Approach

Authors

  • Klaus Seyerlehner
  • Gerhard Widmer
  • Peter Knees
Abstract

Modeling audio signals by the long-term statistical distribution of their local spectral features, often called the bag-of-frames (BOF) approach, is a popular and powerful way to describe audio content. While modeling this distribution with semi-parametric models (e.g. Gaussian Mixture Models, GMMs) has been studied intensively, this paper investigates a non-parametric variant based on vector quantization (VQ). The essential advantage of the proposed VQ approach over state-of-the-art similarity measures is that the proposed audio similarity metric forms a normed vector space. This permits more powerful search strategies, e.g. KD-trees or Locality-Sensitive Hashing (LSH), making content-based audio similarity practical for even larger music archives. Standard VQ approaches are known to be computationally very expensive; to counter this problem, we propose a multi-level clustering architecture. We also show that the multi-level vector quantization approach (ML-VQ), in contrast to standard VQ approaches, is comparable in quality to state-of-the-art frame-level similarity measures. Another important finding is that, in contrast to GMM-based song models, the ML-VQ approach does not seem to suffer from the recently discovered hub problem.
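
The abstract gives no implementation details, so the following is only a minimal sketch of the general codebook idea, not the authors' multi-level ML-VQ architecture. It assumes a single-level k-means codebook, a normalised codeword histogram as the per-song representation, and the L1 norm as the distance; the function names and all parameter values are hypothetical and chosen purely for illustration.

```python
import numpy as np

def quantize(frames, codebook):
    """Index of the nearest codeword (squared Euclidean) for each frame."""
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def train_codebook(frames, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm); an assumed stand-in for the paper's clustering."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords from k distinct training frames.
    codebook = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = quantize(frames, codebook)
        for j in range(k):
            members = frames[labels == j]
            if len(members):  # leave a codeword unchanged if its cluster is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def song_histogram(frames, codebook):
    """Normalised codeword-occurrence histogram: one fixed-length vector per song."""
    counts = np.bincount(quantize(frames, codebook), minlength=len(codebook))
    return counts / counts.sum()

def l1_distance(h1, h2):
    """L1 norm of the histogram difference (an assumed choice of norm)."""
    return float(np.abs(h1 - h2).sum())

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-ins for frame-level spectral features (4 dimensions).
    codebook = train_codebook(rng.normal(size=(200, 4)), k=8)
    song_a = song_histogram(rng.normal(size=(50, 4)), codebook)
    song_b = song_histogram(rng.normal(loc=1.0, size=(50, 4)), codebook)
    print(l1_distance(song_a, song_b))
```

Because every song becomes a fixed-length vector compared under a norm, such vectors could in principle be indexed with standard structures such as KD-trees, which is the property the abstract highlights.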

Similar resources

Visual tracking via bag of features

In this paper, we propose a visual tracking approach based on the ‘bag of features’ (BoF) algorithm. First, we use incremental PCA visual tracking (IVT) in the first few frames and collect image patches randomly sampled within the tracked object region in each frame to construct the codebook; the tracked object can then be converted to a bag. Second, we construct two codebooks using color (RGB) f...

Content-Based Music Recommender Systems: Beyond Simple Frame-Level Audio Similarity (dissertation)

This thesis aims at improving content-based music recommender systems. Besides a general introduction to music recommendation and an in-depth discussion of evaluation methods of content-based music recommender systems, improvements on two different abstraction levels are considered in this thesis: The first and most obvious way to improve a content-based music recommender system is to improve t...

Novel Keyframe Extraction for Video Content Summarization using LBG Codebook Generation Technique of Vector Quantization

In the current era, most digital information is in the form of multimedia, with a giant share of videos. Videos have audio and visual content, where the visual content consists of a number of frames in a sequence. Most consecutive frames have very little discriminative content. In the video summarization process, several frames containing similar information need to be processed. T...

Robust audio-codebooks for large-scale event detection in consumer videos

In this paper we present our audio-based system for detecting “events” within consumer videos (e.g. YouTube) and report our experiments on the TRECVID Multimedia Event Detection (MED) task and development data. Codebook or bag-of-words models have been widely used in text, visual and audio domains and form the state-of-the-art in MED tasks. The overall effectiveness of these models on such dat...

Dominant Feature Vectors Based Audio Similarity Measure

This paper presents an approach to extracting dominant feature vectors from an individual audio clip and then proposes a new similarity measure based on the dominant feature vectors. Instead of using the mean and standard deviation of frame features in most conventional methods, the most salient characteristics of an audio clip are represented in the form of several dominant feature vectors. Th...

Publication date: 2008